In this notebook we will compare the biomarker results from 8 different cell lines. These results were obtained through an ensemble model analysis, which was performed on the models generated by the Gitsbe module when running the DrugLogics computational pipeline for finding synergistic drug combinations (drug pairs). The models in each different cell line setup were trained towards a specific signaling activity profile matching the activity state of that particular cell line. For more info about the cell line specific model & biomarker analysis, see the respective notebooks:
Note that the biomarkers identified in the aforementioned notebooks, were classified as:
In each category, the biomarkers can be either in an active state or in an inhibited state. The input for the simulations and the output data are in the cell-lines-2500 directory (the 2500 number denotes the number of simulations executed). The analysis will be presented step by step in the sections below.
Firstly, we load the required libraries (you need to install them if you don’t have them):
library(circlize) # version 0.4.5, see (Gu, 2014)
library(ComplexHeatmap) # version 1.99.5, see (Gu, 2016)
and the relevant helper functions:
# Set the working directory to the gitsbe-model-analysis folder:
# setwd("pathTo/gitsbe-model-analysis")
source("Rscripts/input_functions.R")
source("Rscripts/output_functions.R")
source("Rscripts/analysis_functions.R")
The R version used for this analysis is:
pretty.print.string(R.version.string)
R version 3.5.3 (2019-03-11)
Firstly, we define the necessary input (cell line names, directories and files of interest, etc.):
cell.lines = c("A498", "AGS", "colo205", "DU145", "MDA-MB-468", "SF295",
"SW620", "UACC62")
cell.lines.dirs = sapply(cell.lines, function(cell.line) {
paste0(getwd(), "/cell-lines-2500/", cell.line)
})
biomarkers.dirs = sapply(cell.lines.dirs, function(cell.line.dir) {
paste0(cell.line.dir, "/biomarkers")
})
observed.synergies.files = sapply(cell.lines.dirs, function(cell.line.dir) {
paste0(cell.line.dir, "/observed_synergies")
})
# get the drug combinations tested
model.predictions.file = paste0(cell.lines.dirs[1], "/model_predictions")
model.predictions = get.model.predictions(model.predictions.file)
drug.combinations.tested = colnames(model.predictions)
# get the node names (same topology for all cell lines)
models.dir = paste0(cell.lines.dirs[1], "/models")
node.names = get.node.names(models.dir)
Next, we will visualize the performance biomarkers found from the cell-specific model analysis, per cell line:
biomarkers.perf.active =
get.perf.biomarkers.per.cell.line(biomarkers.dirs, type = "active")
biomarkers.perf.inhibited =
get.perf.biomarkers.per.cell.line(biomarkers.dirs, type = "inhibited")
biomarkers.perf.res =
merge.perf.biomarkers(node.names, cell.lines, biomarkers.perf.active,
biomarkers.perf.inhibited)
# remove nodes which are not biomarkers for any cell line
biomarkers.perf.res = prune.columns.from.df(biomarkers.perf.res, value = 0)
# define a coloring
biomarkers.col.fun = colorRamp2(c(-1, 0, 1), c("tomato", "grey", "gold"))
perf.biomarkers.heatmap =
Heatmap(matrix = as.matrix(biomarkers.perf.res), col = biomarkers.col.fun,
column_title = "Performance biomarkers per cell line",
column_title_gp = gpar(fontsize = 20),
row_title = "Cell Lines", row_title_side = "left",
row_dend_side = "right", row_names_side = "left",
column_names_gp = gpar(fontsize = 5),
rect_gp = gpar(col = "black", lwd = 0.3),
heatmap_legend_param = list(at = c(-1, 1), title = "Activity State",
labels = c("Inhibited", "Active"), color_bar = "discrete", ncol = 2,
title_position = "leftcenter", direction = "horizontal"))
draw(perf.biomarkers.heatmap, heatmap_legend_side = "bottom")
AGS,SW620,colo205) have a lot of common biomarkers. These common biomarkers between the cell lines can be in either the same activity state (SW620 vs AGS) or the reverse (SW620 vs colo205)Each of the cell lines studied has a different set of observed synergies (drug combinations that were found synergistic across all the 153 tested ones). In this section, we will visualize the cell lines’ observed synergies and mark the synergies that were also predicted by the cell-specific models. First, we get the biomarkers for these synergies from each cell line:
synergy.biomarkers.cell.specific =
get.synergy.biomarkers.per.cell.line(biomarkers.dirs)
total.predicted.synergies.cell.specific = character(0)
for (cell.line in cell.lines) {
total.predicted.synergies.cell.specific =
c(total.predicted.synergies.cell.specific,
rownames(synergy.biomarkers.cell.specific[[cell.line]]))
}
total.predicted.synergies.cell.specific =
unique(total.predicted.synergies.cell.specific)
total.predicted.synergies.cell.specific.num =
length(total.predicted.synergies.cell.specific)
Then, we get the observed synergies from each cell line:
observed.synergies.per.cell.line = sapply(
observed.synergies.files, function(observed.synergies.file) {
get.observed.synergies(observed.synergies.file)
})
observed.synergies.res =
merge.observed.synergies(drug.combinations.tested, cell.lines,
observed.synergies.per.cell.line)
# remove drug combinations which are not observed in any of the cell lines
observed.synergies.res = prune.columns.from.df(observed.synergies.res, value = 0)
total.observed.synergies = colnames(observed.synergies.res)
total.observed.synergies.num = length(total.observed.synergies)
Lastly, we visualize the observed and predicted synergies for all cell lines in one heatmap:
# color the cell-specific predicted synergies
predicted.synergies.colors = rep("black", total.observed.synergies.num)
names(predicted.synergies.colors) = total.observed.synergies
predicted.synergies.colors[total.observed.synergies %in%
total.predicted.synergies.cell.specific] = "blue"
# define a coloring
obs.synergies.col.fun = colorRamp2(c(0, 1), c("red", "green"))
observed.synergies.heatmap =
Heatmap(matrix = as.matrix(observed.synergies.res),
col = obs.synergies.col.fun,
column_title = "Observed synergies per cell line",
column_title_gp = gpar(fontsize = 20),
column_names_gp = gpar(col = predicted.synergies.colors),
row_title = "Cell Lines", row_title_side = "left",
row_dend_side = "right", row_names_side = "left",
rect_gp = gpar(col = "black", lwd = 0.3),
heatmap_legend_param = list(at = c(1, 0), labels = c("YES", "NO"),
color_bar = "discrete", title = "Observed", direction = "vertical"))
lgd = Legend(at = "Cell-specific", title = "Predicted",
legend_gp = gpar(fill = "blue"))
draw(observed.synergies.heatmap, heatmap_legend_list = list(lgd),
heatmap_legend_side = "right")
AK-BI, PI-D1)PD-PI, AK-PD in the A498 cell line) were identified only by the cell-specific models (????? MAYBE! HAVE TO CHECK FIRST WITH RANDOM MODEL PREDICTIONS)In this section, we will produce heatmaps showing the biomarkers across the 8 cell lines for every observed synergy that was also predicted by the cell-specific models. First, we manipulate the biomarker result data to fit our purposes:
synergy.biomarkers.cell.specific.res =
arrange.by.synergy(synergy.biomarkers.cell.specific, observed.synergies.res,
total.predicted.synergies.cell.specific, node.names)
# For every synergy, remove cell lines that didn't predict the observed synergies
# at all or for which we couldn't find any biomarkers (row pruning).
# Also remove nodes which are not biomarkers for any cell line (column pruning)
for (synergy in names(synergy.biomarkers.cell.specific.res)) {
df = synergy.biomarkers.cell.specific.res[[synergy]]
df = prune.columns.from.df(df, value = 0)
df = prune.rows.from.df(df, value = 0)
synergy.biomarkers.cell.specific.res[[synergy]] = df
}
# re-order result list based on increasing number of cell lines
synergy.biomarkers.cell.specific.res = synergy.biomarkers.cell.specific.res[
names(sort(sapply(synergy.biomarkers.cell.specific.res, function(x) { nrow(x) })))
]
Next, we produce the heatmaps (one per synergy predicted):
for (synergy in names(synergy.biomarkers.cell.specific.res)) {
biomarkers.res = as.matrix(synergy.biomarkers.cell.specific.res[[synergy]])
# customize some parameters
heatmap.height = ifelse(nrow(biomarkers.res) < 3, 1, 3)
show.column.dend = ifelse(nrow(biomarkers.res) == 1, FALSE, TRUE)
row.title = ifelse(nrow(biomarkers.res) == 1, "Cell Line", "Cell Lines")
biomarkers.heatmap =
Heatmap(matrix = biomarkers.res, col = biomarkers.col.fun,
column_title = paste0("Biomarkers for Synergy: ", synergy),
column_title_gp = gpar(fontsize = 20, fontface = "bold"),
row_title = row.title, row_title_side = "right",
rect_gp = gpar(col = "black", lwd = 0.3),
column_names_gp = gpar(fontsize = 10),
show_column_dend = show.column.dend,
height = unit(heatmap.height, "inches"),
heatmap_legend_param = list(at = c(-1, 1), title = "Activity State",
labels = c("Inhibited", "Active"), color_bar = "discrete",
title_position = "leftcenter", direction = "horizontal", ncol = 2))
draw(biomarkers.heatmap, heatmap_legend_side = "bottom")
}
The main result here is that we observe common biomarkers across many cell lines and in the same activity state for some synergies, which hints that these biomarkers may have a pan-cancer diagnostic value.